
Annals of Internal Medicine

27 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
Progressively Widening Healthcare Costs in Long COVID Over Five Years
2026-02-26 public and global health 10.64898/2026.02.24.26346985
#1 (1.9%)

Background: Long COVID affects millions worldwide, yet the long-term trajectory of healthcare costs remains poorly characterized. Prior studies with limited follow-up have documented elevated but stable excess costs, leaving uncertainty about whether the economic burden attenuates or persists over time. Methods: We conducted a retrospective cohort study using electronic health record data from 12 hospitals and 20 community health centers (January 2018 through December 2024). Adults with documented ...

2
Revised estimates of the types and durations of long Covid symptoms based on claims records from 245 Million US patients
2026-02-18 epidemiology 10.64898/2026.02.17.26346448
#1 (1.8%)

COVID-19 has been shown to cause a range of harmful long-term effects on nearly every organ system [1-3]. These findings are based on retrospective studies comparing COVID-19 patients to patients with similar medical histories and demographics but no COVID-19 diagnosis [4-16]. However, concerns have emerged that these comparisons may be biased if COVID-19 patients had unrelated health conditions or other factors not recorded in their medical records [17-21]. Here, using a massive dataset of 14.4 billion ...

3
Can AI Match Human Experts? Evaluating LLM-Generated Feedback on Resident Scholarly Projects
2026-03-04 medical education 10.64898/2026.03.04.26346878
Top 0.5% (1.0%)

Background: Delivering timely, high-quality feedback on resident scholarly projects is labour-intensive, especially in large programmes. We developed an AI-assisted evaluation system, powered by the open-weight LLaMA-3.1 large-language model (LLM), to generate formative feedback on Family Medicine residents' scholarly projects and compared its performance with expert human evaluators. Methods: We evaluated whether the AI-generated feedback achieves comparable quality to expert feedback. The tool ing...

4
Large Language Models Readability Classification: A Variability Analysis of Sources and Metrics
2026-03-02 public and global health 10.64898/2026.02.20.26346638
Top 0.6% (1.0%)

Accurate health information is ineffective if patients cannot understand it. Large Language Model (LLM) health research values veridical precision; however, linguistic accessibility remains an under-examined component of output quality and usability. This study investigated two sources of variability in readability classification: differences across LLM systems and across readability metrics. The analysis tested 1,120 data points from seven systems in English and Portuguese, comparing ba...

5
Exploratory analyses of Immunologic Features in a Randomized, Placebo-Controlled Trial of Nirmatrelvir/Ritonavir for Long COVID
2026-02-26 public and global health 10.64898/2026.02.24.26347001
Top 0.6% (1.0%)

This exploratory analysis of PAX LC, a Phase 2, 1:1 randomized, double-blind, superiority, placebo-controlled trial examined whether treatment with nirmatrelvir/ritonavir (NMV/r) versus placebo/ritonavir (PBO/r) in individuals with Long COVID could reveal immune features associated with symptom improvement. Eighty-two participants (n=45 PBO/r; n=37 NMV/r) provided blood samples at baseline (Day 0) and post-treatment (Day 28). Baseline demographic and immunological phenotypes were similar in the ...

6
Graph-Augmented Retrieval for Digital Evidence-Based Medical Synthesis: A Proof-of-Concept Study on Topology-Aware Mechanistic Narrative Generation
2026-02-19 health systems and quality improvement 10.64898/2026.02.18.26346545
Top 0.7% (0.9%)

Background: Retrieval-augmented generation (RAG) frameworks such as RAPID [1] have demonstrated that staged planning and retrieval grounding improve long-form text generation. However, most implementations remain similarity-driven and open-domain, lacking the epistemic safeguards required for biomedical synthesis, where mechanistic completeness, temporal governance, traceability, and explicit gap classification are essential. Objective: To develop and evaluate a topology-aware, graph-augmented retr...

7
Risk of new-onset obstructive sleep apnea up to 4.5 years after COVID-19 in the urban population.
2026-02-15 infectious diseases 10.64898/2026.02.12.26346136
Top 0.8% (0.9%)

Rationale: Obstructive sleep apnea (OSA) is linked to cardiovascular, metabolic, and cognitive morbidity. Although COVID-19 has been associated with long-term respiratory and neurological sequelae, its role in precipitating new-onset OSA remains unclear. Objectives: To evaluate whether SARS-CoV-2 infection increases risk of developing OSA up to 4.5 years post-infection and how risk varies by hospitalization status, demographics, comorbidities, and vaccination status. Methods: This retrospective coho...

8
Associations between SARS-CoV-2 Infection and Multidimensional Sleep Health
2026-02-25 infectious diseases 10.64898/2026.02.19.26346546
Top 0.8% (0.9%)

Purpose: To evaluate the short- and long-term cross-sectional associations between COVID-19 infection and multidimensional sleep health. Methods: Data from the COVID-19 Outbreak Public Evaluation (COPE) initiative were used to examine the association between a novel multidimensional sleep health measure (COPE Multidimensional Sleep Health Scale, CMSHS) modeled from the RuSATED instrument and (1) COVID-19 infection and (2) post-acute sequelae of SARS-CoV-2 infection (PASC). Results: Data from 11,326...

9
Antibiotic coverage in biliary-stented pancreatoduodenectomy: Real-world evidence supporting piperacillin tazobactam over ampicillin sulbactam
2026-02-14 infectious diseases 10.64898/2026.02.12.26346173
Top 0.9% (0.9%)

Background: Preoperative biliary stenting alters biliary colonization and may reduce the effectiveness of perioperative antibiotic prophylaxis in pancreatoduodenectomy. Although broader-spectrum regimens have been associated with improved infectious outcomes, their microbiological adequacy in routine clinical practice remains poorly defined. We therefore evaluated the real-world adequacy of a prolonged ampicillin-sulbactam protocol, its association with infectious outcomes and survival, and the po...

10
Admission Predictors of In-Hospital Mortality and the Combined Outcome of Death or Invasive Mechanical Ventilation in Patients with COVID-19 During the Pre-Vaccination Era: A Retrospective Cohort Study
2026-03-03 infectious diseases 10.64898/2026.02.28.26347308
Top 1% (0.7%)

Background: Reliable identification of early predictors of adverse outcomes was essential during the pre-vaccination phase of the COVID-19 pandemic. Few studies have comprehensively integrated clinical presentation, laboratory parameters including arterial blood gas analysis, and chest computed tomography (CT) findings within a single well-characterized cohort, particularly in underrepresented regions of Brazil. Methods: This retrospective cohort study included 482 consecutive adults (median age 61...

11
Evaluating a Locally Deployed 20-Billion Parameter Large Language Model for Automated Abstract Screening in Systematic Reviews
2026-03-04 health informatics 10.64898/2026.03.04.26347506
Top 1% (0.7%)

Background: Systematic reviews (SRs) are essential for evidence-based medicine but require extensive time and resources for abstract screening. Large language models (LLMs) offer potential for automating this process, yet concerns about data privacy, intellectual property protection, and reproducibility limit the use of cloud-based solutions in research settings. Objective: To evaluate the performance of a locally deployed 20-billion parameter LLM for automated abstract screening in systematic revi...

12
Leveraging large language models to address common vaccination myths and misconceptions
2026-03-02 health informatics 10.64898/2026.02.27.26347254
Top 1% (0.7%)

Large language models (LLMs) are increasingly used by the public to seek health information, yet their reliability in addressing common vaccine myths remains unclear. We conducted an exploratory multi-vendor evaluation of three LLMs (GPT-5, Gemini 2.5 Flash, Claude Sonnet 4) using officially curated vaccination myths from Germany's public health institution and two realistic user framings as prompts: a curious skeptic and a convinced believer. All model responses were independently evaluated by t...

13
Cultryx: Precision Diagnostic Stewardship for Blood Cultures Using Machine Learning
2026-03-04 infectious diseases 10.64898/2026.02.27.26347214
Top 1% (0.7%)

Background: The 2024 blood culture bottle shortage brought diagnostic resource allocation to the forefront, reflecting persistent, foundational challenges with low-value testing and empiric treatment approaches under clinical uncertainty. Objective: To determine whether a machine learning approach using electronic medical record data can predict bacteremia more effectively than existing systems and practices to guide diagnostic testing and empiric treatment strategies. Methods: In a retrospective co...

14
Boards-style benchmarks overestimate prior-chat bias in large language models: a factorial evaluation study
2026-02-14 health informatics 10.64898/2026.02.12.26346164
Top 1% (0.7%)

Background: Large language models (LLMs) are increasingly piloted as chat interfaces for chart review and clinical decision support. Although leading models achieve and even exceed physician-level accuracy on exam-style benchmarks such as MedQA, recent perturbation studies show large drops in accuracy after small changes to prompts, distractor content, or answer format. Prior work has not systematically examined how these vulnerabilities unintentionally manifest in clinically realistic settings, i...

15
Repeated histological diagnoses and kidney graft failure: an observational cohort study
2026-02-18 transplantation 10.64898/2026.02.17.26346474
Top 2% (0.7%)

Background: The effects of Banff histological diagnoses on kidney transplant outcome have been well characterized. However, repeated observation of such histological injury across multiple biopsies in kidney transplant recipients remains insufficiently explored. Methods: In an observational cohort (N=1819 transplantations with 5736 post-transplant biopsies), recurrent event survival models quantified transitions between diagnoses of T-cell mediated rejection (TCMR), antibody-mediated rejection (AMR)...

16
Revisiting the Area Deprivation Index
2026-02-28 health policy 10.64898/2026.02.26.26346490
Top 2% (0.5%)

Objective: To re-estimate and re-validate the Area Deprivation Index to address recent criticism of the existing index, which is calculated and distributed by Neighborhood Atlas. Data Sources: To calculate the updated Area Deprivation Index (ADI), we obtained 17 census measures from the 2018-2022 American Community Survey (ACS) 5-year data that reflected poverty, housing, employment, and education within census block groups, census tracts, and counties. To validate the association of the updated in...

17
Show Your Work: Verbatim Evidence Requirements and Automated Assessment for Large Language Models in Biomedical Text Processing
2026-03-04 health informatics 10.64898/2026.03.03.26346690
Top 2% (0.5%)

Purpose: Large language models (LLMs) are used for biomedical text processing, but individual decisions are often hard to audit. We evaluated whether enforcing a mechanically checkable "show your work" quote affects accuracy, stability, and verifiability for trial eligibility-scope classification from abstracts. Methods: We used 200 oncology randomized controlled trials (2005-2023) and provided models with only the title and abstract. Trials were labeled with whether they allowed for the inclusio...

18
Fully Automated Systematic Review Generation via Large Language Models: Quality Assessment and Implications for Scientific Publishing
2026-02-23 health informatics 10.64898/2026.02.18.26346559
Top 2% (0.5%)

Large language models (LLMs) are increasingly transforming scientific workflows, yet their application to rigorous evidence synthesis remains underexplored. Through the execution of a single Python script, we present a fully automated pipeline leveraging the Claude API to generate systematic reviews from literature search through manuscript completion without human intervention. Our pipeline processes hundreds of papers through iterative API calls for inclusion evaluation, information extraction...

19
Reclaiming health: a qualitative, explorative study of long covid recovery journeys involving mind-body approaches.
2026-02-23 infectious diseases 10.64898/2026.02.21.26345052
Top 3% (0.5%)

Objective: This study explored the recovery experiences of individuals who report having (largely) recovered from long covid and who attributed their improvement to mind-body approaches. Design, setting and participants: We conducted an explorative qualitative study using purposive recruitment through social media and snowball sampling. Eighteen adult women (aged 37-62 years), who self-identified as having had long covid and having substantially recovered through mind-body approaches, participated i...

20
Digital Adherence Support for Tuberculosis Treatment: A Multicentre Randomized Trial in Kenya
2026-02-14 infectious diseases 10.64898/2026.02.11.26346015
Top 3% (0.5%)

Background: Improving tuberculosis (TB) treatment success is critical for improving the health of individuals with TB, reducing transmission, and lowering treatment costs. We conducted a four-arm randomized controlled trial (RCT) to evaluate whether three digital interventions with increasing support improved treatment outcomes compared to the standard of care. Methods: In this open-label, parallel RCT in Kenya, all TB patients at 902 participating clinics who had at least 2 months of treatment rem...